Abstract

The concept of fitness is introduced, and a simple derivation of 
the Fundamental Theorem of Natural Selection (which states that the 
average fitness of a population increases if its variance is nonzero) is 
given. After a short discussion of the adaptative walk model, a short 
review is given of the quasispecies approach to molecular evolution 
and to the error threshold. The relevance of flat fitness landscapes to 
molecular evolution is stressed. Finally, a few examples which involve
wider concepts of fitness, and in particular two-level selection, are
briefly reviewed.

1 Fitness and the fundamental theorem of natural selection



The term "fitness" derives from the phrase "survival of the fittest" that 
the philosopher Herbert Spencer suggested using instead of "natural
selection". In the struggle of evolutionary theorists to avoid the
tautology lurking in the phrase, the term has been twisted into several
meanings. R. Dawkins [?] distinguishes no fewer than five different meanings of the
word in the evolutionary literature. From the point of view of model build- 
ing, the most convenient meaning — and the one we shall adopt — is however 
the following: 

The fitness of an individual is proportional to the average number 
of offspring it may have in the given environment. 

In this definition, fitness is assigned to individuals rather than to genes or to
groups of individuals. It is further assumed that reproduction takes place via 
a stochastic process, and that, in a given population, the average numbers of 
(immediate) offspring of two individuals have the same ratio as their fitnesses: 
therefore, only ratios of fitnesses have a well-defined meaning, and not their 
absolute value. 

Let us consider a population formed by a certain number of individuals,
whose inheritable characteristics (genotype) are summarized by the variable
σ. Let us further assume that the population reproduces asexually, that the
offspring of an individual have the same genotype as the parent, and finally
that the number of offspring is exactly proportional to the fitness of the
parent: briefly, let us neglect mutations in the genotype and fluctuations in
the number of offspring.

We can thus write down an equation expressing the number n_{t+1}(σ) of
individuals carrying the genotype σ at generation t+1, given the same quantity
n_t(σ) at generation t, assuming that the fitness A(σ) of an individual is a
function only of its genotype:

$$ n_{t+1}(\sigma) = \frac{1}{Z_t}\, A(\sigma)\, n_t(\sigma), \tag{1} $$

where Z_t is a proportionality constant. In order to simplify the argument
we have also assumed that the generations are nonoverlapping, i.e., that all
individuals, once reproduced, die.

The total number M_t of individuals in the population at generation t is
given by

$$ M_t = \sum_{\sigma} n_t(\sigma). \tag{2} $$

We define the population average ⟨Q⟩_t at generation t of a quantity Q(σ),
which depends only on the genotype σ, in the following way:

$$ \langle Q \rangle_t = \frac{1}{M_t} \sum_{\sigma} Q(\sigma)\, n_t(\sigma). \tag{3} $$

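We can now prove that the average fitness ⟨A⟩_t always increases, unless
all individuals have the same fitness. We have in fact:

$$ \langle A \rangle_{t+1} = \frac{1}{M_{t+1}} \sum_{\sigma} A(\sigma)\, n_{t+1}(\sigma) = \frac{1}{M_{t+1} Z_t} \sum_{\sigma} A^2(\sigma)\, n_t(\sigma). \tag{4} $$

On the other hand, one has

$$ M_{t+1} = \frac{1}{Z_t} \sum_{\sigma} A(\sigma)\, n_t(\sigma) = \frac{M_t}{Z_t} \langle A \rangle_t. \tag{5} $$

Therefore

$$ \langle A \rangle_{t+1} \langle A \rangle_t = \langle A^2 \rangle_t \ge \langle A \rangle_t^2, \tag{6} $$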

and the equality holds only if all individuals in the population have the same 
fitness. In fact, the larger the variance in the fitness, the faster its average 
grows. 
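
The argument above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the three genotype labels, their fitness values and the initial counts are illustrative assumptions, and Z_t is chosen so that the population size stays constant (a constraint that the text introduces only later, for the quasispecies equation).

```python
def next_generation(counts, fitness):
    """One step of eq. (1): n_{t+1}(s) is proportional to A(s) * n_t(s)."""
    total = sum(counts.values())
    raw = {g: fitness[g] * n for g, n in counts.items()}
    z = sum(raw.values()) / total  # assumed choice of Z_t: keeps the population size fixed
    return {g: r / z for g, r in raw.items()}


def mean_fitness(counts, fitness):
    """Population average <A>_t, as in eq. (3)."""
    total = sum(counts.values())
    return sum(fitness[g] * n for g, n in counts.items()) / total


def mean_square_fitness(counts, fitness):
    """Population average <A^2>_t, as in eq. (3)."""
    total = sum(counts.values())
    return sum(fitness[g] ** 2 * n for g, n in counts.items()) / total


# Hypothetical three-genotype population (labels and numbers are illustrative only).
fitness = {"a": 1.0, "b": 1.5, "c": 2.0}
counts = {"a": 600.0, "b": 300.0, "c": 100.0}

history = []  # average fitness, generation by generation
for _ in range(30):
    history.append(mean_fitness(counts, fitness))
    counts = next_generation(counts, fitness)
```

In this sketch `history` never decreases and approaches the largest fitness present in the initial population, in agreement with eq. (6).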

This result is a simplified version of the Fundamental Theorem of Natural
Selection due to R. Fisher [?, p. 22ff]. Some authors have considered it as
the key point of difference between the living and the inorganic world. As
K. Sigmund puts it [32, p. 108]:

So we see, in physics, disorder growing inexorably in systems iso- 
lated from their surroundings; and in biology, fitness increasing 
steadily in populations struggling for life. Ascent here and degra- 
dation there — almost too good to be true. 

In fact, the result depends on many unrealistic assumptions. Let alone 
the complications introduced by sex, which lead to maddeningly complex 
behavior, let us focus on the effects of mutation: on that set of causes which 
makes offspring different from their parent, even among bacteria. 

We all know that genetic information is carried by the DNA, in the form 
of a sequence of nucleotide bases, which belong to four different types: A 
adenine and G guanine (purines); T thymine and C cytosine (pyrimidines).
In the double helix of DNA they are found in matching pairs: A-T and
G-C. During replication, it may happen that the replication mechanism,
which associates one of the "old" strands to the "new" ones, makes
some errors. These errors can be divided into a few classes:

Point mutations: Substitution of one nucleotide base for another. They
can be divided into two classes:

Transitions (the most common): substitution of one purine by the
other, or of one pyrimidine by the other;

Transversions, in which a purine is replaced by a pyrimidine or vice
versa.

Insertions and deletions: They correspond to the introduction of new
bases in the strand or to their omission, respectively. In the case of
sequences coding for a protein, these mutations are often fatal, since 
they entail a frame shift in the translation into proteins, unless they 
occur by threes. 

Major rearrangements: In this class one considers the insertion (or
deletion) of comparatively long sequences. This is the case, e.g., of the
transposable elements which are known to move easily from one place 
to another in the genotype. A subclass of special interest is gene dou- 
bling. 

These processes do not have the same probability. If one considers two 
genotypes, i.e., two different nucleotide sequences in DNA, one may introduce 
a notion of distance between them by considering the probability of the most 
likely mutation path connecting them. This notion of distance (metric) has
a rather immediate evolutionary meaning, but is most often quite difficult to 
compute. For the sake of definiteness I shall consider in the following only 
point mutations, and I shall assign the same probability to transitions and 
to transversions. In this case all DNA sequences which may be connected to 
each other have the same length, and their distance is equal to the number 
of points in the sequence in which different bases are found: this is known as 
the Hamming distance. 

To summarize: we have defined a genotype space as the space of all
sequences σ of a given length which can be built with the four-letter
alphabet ATGC. This space is endowed with a metric, defined by the Hamming
distance, i.e., by the number of corresponding positions in the sequence in
which different nucleotide pairs are encountered. If we now assign to each
such sequence its fitness value (assuming that the fitness of an individual is
a function only of its genotype), we obtain a fitness landscape. The phrase
goes back to S. Wright [?], but the concept can already be found in Fisher's
work in the 1920s.

The Fundamental Theorem therefore implies that populations move on 
fitness landscapes striving to climb up their peaks, while we are accustomed
to physical systems rolling down the slopes towards the points of smallest 
energy. In this sense, fitness plays in evolutionary theory a role similar to 
energy in mechanics. 



2 Adaptative walks

The Fundamental Theorem intimates in fact that the population rapidly 
reaches the maximum fitness of all individuals that are already present in it. 
Higher values of the fitness can only arise if there are mutations. If mutations 
are rare, one can think of a regime in which mutants arise from time to time 
and, if they correspond to higher fitness, "draw" the population to the new 
fitness value. This justifies the evolutionary model known as the Adaptative
Walk [21].



In order to simplify the discussion, we shall consider from now on a
genotype written in a two-letter alphabet. The conclusions that we shall draw
can be easily translated, in principle, into the four-letter alphabet of real life.
We shall denote the two letters by {−1, +1}, and describe the genotype σ
by a collection of N binary variables (units): σ = (σ_1, σ_2, …, σ_N), where
σ_i = ±1, ∀i. The space of these genotypes is the hypercube in N dimensions,
whose 2^N vertices correspond to the genotypes, and the Hamming distance
between genotypes σ and σ' is given by

$$ d_H(\sigma, \sigma') = \frac{1}{2} \sum_{i=1}^{N} \left( 1 - \sigma_i \sigma'_i \right). \tag{7} $$

We shall also consider an equivalent measure of the similarity or dissimilarity
of genotypes, namely the overlap q, central to the theory of spin glasses
and defined by

$$ q = \frac{1}{N} \sum_{i=1}^{N} \sigma_i \sigma'_i = 1 - \frac{2\, d_H(\sigma, \sigma')}{N}. \tag{8} $$

The overlap between identical genotypes is equal to 1, and it decreases as the 
Hamming distance increases. Two completely independent genotypes will be 
different, on average, in half of their units, and the corresponding overlap 
will be close to zero. 
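
As a small illustration of eqs. (7) and (8), the following sketch computes both quantities for two short genotypes. The function names and the two example sequences are assumptions made for the example only.

```python
def hamming_distance(s, t):
    """Eq. (7): number of positions where two +/-1 sequences differ."""
    return sum((1 - a * b) // 2 for a, b in zip(s, t))


def overlap(s, t):
    """Eq. (8): q = (1/N) * sum_i s_i * t_i."""
    return sum(a * b for a, b in zip(s, t)) / len(s)


# Two illustrative genotypes of length N = 6 that differ in two units.
s = [+1, -1, +1, +1, -1, -1]
t = [+1, +1, +1, -1, -1, -1]

d = hamming_distance(s, t)
q = overlap(s, t)  # equals 1 - 2 * d / N
```

The two measures carry the same information: `q` can always be recovered from `d` and the sequence length, and vice versa.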



We can now define the Adaptative Walk model [22, p. 39-40]. To
each genotype σ is associated its fitness A(σ). One assumes that the
population is characterized by a single genotype σ(t) at each generation t. The
initial genotype σ(0) is chosen at random. Given the genotype σ(t), the next
genotype is chosen according to the following procedure:

(i) One changes the sign of one of the units of the genotype σ(t), chosen at
random; in other words, one chooses at random one of the N vertices
of the hypercube closest to σ(t); one thus obtains a tentative genotype
σ'(t);

(ii) If A(σ'(t)) > A(σ(t)), then σ(t + 1) = σ'(t); otherwise σ(t + 1) = σ(t).

This procedure is reminiscent of a zero-temperature Monte Carlo dynamics,
where the Hamiltonian is a decreasing function of the fitness A(σ).
Evolution is bound to finish on a local fitness maximum. More explicit predictions
can only be made when more properties of the fitness landscape are known.
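
Steps (i) and (ii) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, and the additive fitness with fixed weights is chosen purely to have something concrete to walk on; it is not a landscape prescribed by the model.

```python
import random


def adaptive_walk(fitness, genotype, max_steps, rng):
    """Repeat steps (i)-(ii): flip one random unit, keep it only if fitness increases."""
    sigma = list(genotype)
    accepted = 0
    for _ in range(max_steps):
        i = rng.randrange(len(sigma))      # step (i): pick one unit at random
        trial = sigma[:]
        trial[i] = -trial[i]
        if fitness(trial) > fitness(sigma):  # step (ii): accept only fitter mutants
            sigma = trial
            accepted += 1
    return sigma, accepted


def is_local_maximum(fitness, sigma):
    """True if no single-unit flip yields a strictly higher fitness."""
    base = fitness(sigma)
    for i in range(len(sigma)):
        trial = list(sigma)
        trial[i] = -trial[i]
        if fitness(trial) > base:
            return False
    return True


# Illustrative fitness (an assumption for this example): additive contributions.
weights = [0.5, -1.2, 0.3, 2.0, -0.7, 1.1, -0.4, 0.9]


def additive_fitness(sigma):
    return sum(w * s for w, s in zip(weights, sigma))


rng = random.Random(0)
start = [rng.choice((-1, 1)) for _ in weights]
final, steps = adaptive_walk(additive_fitness, start, max_steps=2000, rng=rng)
```

With a fixed budget of attempted flips the walk ends on a genotype that no single flip can improve; on more irregular landscapes this end point depends on the starting genotype and on the sequence of attempted mutations.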

We do not know in general the intricate conditions which determine the 
fitness of a given species as a function of its genotype: we can only expect 
the fitness landscape to be rather irregular and complicated. It has been
suggested [?] that a given fitness landscape be represented as a realization
of a random function.

A rather general class of random functions defined on the N-dimensional
hypercube has been introduced by B. Derrida in the context of spin-glass
theory [7]. It is defined by the expression

$$ A^{(p)}(\sigma) = \sum_{\{i_1, i_2, \ldots, i_p\}} J_{\{i_1, i_2, \ldots, i_p\}}\, \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_p}, \tag{9} $$

where the sum runs over all different subsets of p indices, and for each such
subset, the coefficient J_{i_1, i_2, …, i_p} is an independent, identically distributed,
real random variable. This model is known as the p-spin model in spin-glass
theory. A slightly different set of random functions has been independently
introduced by S. A. Kauffman in the context of Adaptative Walks [21,
p. 54-62], where it is known as the NK-model.

The simplest case is of course p = 1. In this case the maximum fitness is
reached for the single genotype σ*, satisfying

$$ \sigma^*_i = \operatorname{sign} J_i. \tag{10} $$

Moreover, since there are no local maxima but σ*, it is possible to reach this
maximum simply by flipping one unit after the other in the right direction.
Evolution is a simple matter in this "Fujiyama landscape", as it has been
called, because it is never necessary to undo the progress already made in
order to go forward.

However, as soon as we go to p > 1, things become much more complicated.

Two properties of the landscape are strictly related: the frequency of local
fitness maxima (peaks), and the correlation (or its contrary, "ruggedness")
of the landscape. We say that a landscape is rugged if the value A^(p)(σ)
of the fitness changes a great deal when the genotype σ changes slightly. A
measure of the ruggedness of the landscape is provided by the correlation
function C(σ, σ') = [A^(p)(σ) A^(p)(σ')]_av, where σ and σ' are two different
genotypes with overlap q, and the average [...]_av is taken over the probability
distribution of the coefficients J. In the "thermodynamic limit" N → ∞,
this quantity is equal, with probability one, to the average of the product
A^(p)(σ) A^(p)(σ') taken over all genotype pairs with overlap equal to q. If this
correlation function decays slowly with decreasing overlap q, the landscape is
smooth; otherwise, it is rugged. The more rugged the landscape, the larger
the frequency of local optima.

Let us assume, with Derrida [7], that the distribution of the coefficients
J is a Gaussian of mean zero and variance equal to J_0^2 p!/(2N^{p−1}). One can
then prove that the probability density P_σ(E) that the fitness A^(p)(σ) of any
given genotype σ is equal to E is also a Gaussian:

$$ P_\sigma(E) = \left[ \delta\!\left( A^{(p)}(\sigma) - E \right) \right]_{\mathrm{av}} \propto \exp\left( -\frac{E^2}{N J_0^2} \right). \tag{11} $$



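The properties of the landscape can be read off the joint probability
distribution function

$$ P_{\sigma\sigma'}(E, E') = \left[ \delta\!\left( A^{(p)}(\sigma) - E \right) \delta\!\left( A^{(p)}(\sigma') - E' \right) \right]_{\mathrm{av}} \propto \exp\left[ -\frac{(E + E')^2}{2 N J_0^2 (1 + q^p)} - \frac{(E - E')^2}{2 N J_0^2 (1 - q^p)} \right], \tag{12} $$

where q = (1/N) Σ_i σ_i σ'_i is the overlap of the two configurations. The
correlation function C(σ, σ') then reads

$$ C(\sigma, \sigma') = \left[ A^{(p)}(\sigma)\, A^{(p)}(\sigma') \right]_{\mathrm{av}} = \frac{1}{2} N J_0^2\, q^p. \tag{13} $$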

It decays more and more rapidly as p increases. As p increases, therefore, the 
landscape becomes more and more "rugged" . At the same time, the number 
of extrema becomes larger and larger. Already for p = 2, which corresponds
to the Sherrington-Kirkpatrick model of spin glasses, this number increases 
exponentially with N. Therefore, it becomes more and more likely that the 
adaptative walk, starting from an arbitrary initial genotype, ends up in a 
local fitness maximum instead of the absolute one. 
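
The contrast between p = 1 and p = 2 can be made concrete by exhaustive enumeration on a small hypercube. The sketch below is only illustrative: the function names are hypothetical, N = 8 is chosen so that all 2^N genotypes can be listed, and the couplings are drawn with unit variance rather than with the normalization used in the text (the normalization does not affect which genotypes are local maxima).

```python
import itertools
import random


def make_p_spin_fitness(n, p, rng):
    """Random fitness of the form of eq. (9): one Gaussian coupling per subset of p units."""
    couplings = {subset: rng.gauss(0.0, 1.0)
                 for subset in itertools.combinations(range(n), p)}

    def fitness(sigma):
        total = 0.0
        for subset, j in couplings.items():
            term = j
            for i in subset:
                term *= sigma[i]
            total += term
        return total

    return fitness


def count_local_maxima(fitness, n):
    """Exhaustively count genotypes that no single flip can improve."""
    count = 0
    for sigma in itertools.product((-1, 1), repeat=n):
        base = fitness(sigma)
        if all(fitness(sigma[:i] + (-sigma[i],) + sigma[i + 1:]) <= base
               for i in range(n)):
            count += 1
    return count


rng = random.Random(1)
n = 8
maxima_p1 = count_local_maxima(make_p_spin_fitness(n, 1, rng), n)
maxima_p2 = count_local_maxima(make_p_spin_fitness(n, 2, rng), n)
```

For p = 1 the enumeration finds the single maximum σ* of eq. (10). For p = 2 the fitness is unchanged under σ → −σ, so local maxima come in pairs and there are at least two of them; how many more appear depends on the realization of the couplings.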

Eventually, as p → ∞ one obtains, whenever |q| < 1,

$$ P_{\sigma\sigma'}(E, E') \simeq P_\sigma(E)\, P_{\sigma'}(E'). \tag{14} $$

This is known as the "rugged landscape" limit, in which the fitnesses
corresponding to different genotypes are independent random quantities.
Adaptative walks in this limit have been thoroughly discussed by Kauffman and
Levin and more recently analyzed by H. Flyvbjerg and B. Lautrup [17].



Let us therefore consider adaptative walks in a landscape in which the
fitness a = A(σ) of each different genotype σ is an independent random
variable, with a given probability distribution function p(a). Several important
properties of this model can be obtained almost immediately [?, p. 47-52]:

• The probability that a given genotype σ is a local fitness optimum is
equal to 1/(N + 1). This can be simply evaluated in terms of the
cumulative distribution function Φ(a) = ∫_{−∞}^{a} da' p(a'), namely, the
probability that the fitness of a given genotype is smaller than a. Calling
P_N the probability that a given genotype is a local maximum, we have
indeed:

$$ P_N = \int_{-\infty}^{\infty} da\, p(a)\, \Phi(a)^N = \frac{1}{N + 1}. \tag{15} $$

• A walk leading to a local optimum will touch on average ≈ log_2 N
different genotypes. In fact, since there are no correlations between the
value of the fitness of one genotype and that of its next (fitter) mutant
(except, of course, that it is larger), the value of 1 − Φ will be halved,
on average, at each mutation step. The previous result tells us that the
walk will stop when 1 − Φ ≈ 1/N. Calling ℓ the length of the walk, we
thus have 1 − 2^{−ℓ} ≈ 1 − 1/N, hence ℓ ≈ log N / log 2.

• The expected time T needed to reach an optimum is proportional to N.
The idea is that the waiting time at each step is inversely proportional
to the probability that any given mutant is fitter, i.e., to 1 − Φ(a). Thus
the waiting time doubles at each step. We obtain therefore, roughly,

$$ T \simeq \sum_{k=0}^{\ell - 1} 2^k = 2^{\ell} - 1 = N - 1. \tag{16} $$

• The expected fitness a* of local optima satisfies the equation Φ(a*) =
1 − 1/N. This result has an important consequence, named by Kauffman
"the complexity catastrophe". As N increases, it is reasonable to
assume that the "typical" values of a increase like some power of N.
On the other hand, 1 − Φ(a) usually decreases faster than any power
as a → ∞. Therefore, as N increases, the fitness values of the local
optima become closer and closer to the "typical" values.
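
The first two properties listed above lend themselves to a quick Monte Carlo check. The sketch below rests on assumptions that are not in the text: fitness values are drawn uniformly in (0, 1) (the results do not depend on the choice of p(a)), and after each accepted step all N neighbours are redrawn independently, which ignores the fact that on the hypercube the previous genotype remains a neighbour. The function names are hypothetical.

```python
import random


def local_optimum_fraction(n, samples, rng):
    """Monte Carlo estimate of P_N: a genotype beats all of its N independent neighbours."""
    hits = 0
    for _ in range(samples):
        own = rng.random()
        if all(rng.random() < own for _ in range(n)):
            hits += 1
    return hits / samples


def walk_length(n, rng):
    """Number of accepted steps of a walk on an uncorrelated landscape.

    Simplifying assumption: the N neighbours are redrawn independently
    after every accepted step.
    """
    current = rng.random()
    steps = 0
    while True:
        fitter = [a for a in (rng.random() for _ in range(n)) if a > current]
        if not fitter:
            return steps  # local optimum reached
        # The first fitter mutant met by the walk is uniform among the fitter ones.
        current = rng.choice(fitter)
        steps += 1


rng = random.Random(2)
n = 64
fraction = local_optimum_fraction(n, 20000, rng)
mean_length = sum(walk_length(n, rng) for _ in range(2000)) / 2000
```

The estimate `fraction` should lie close to 1/(N + 1), while `mean_length` comes out of the same order as log_2 N without having to coincide with it, since the halving argument is only a rough estimate.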

Before going on, let us emphasize, following [21], that similar results
are expected to hold qualitatively also for adaptative walks on correlated
(smoother) landscapes. Let us consider such a landscape, and assume that
the correlation function C(σ, σ') vanishes when the Hamming distance
d_H(σ, σ') is larger than, say, δ. Starting from one given genotype σ_0, after
a certain number τ of evolutionary steps the genotype σ(t) will be more than
δ away from σ_0: its fitness will therefore be uncorrelated with that of σ_0 (except, of
course, for the fact that it is larger). Therefore, in the long run, the walk
will resemble a walk on a rugged fitness landscape, apart from a rescaling
of the elementary step length from δ to one, and of the unit of time from τ
generations to one.

One of the lessons to be taken from this result is that the adaptative 
walk framework is too narrow to allow for a high degree of adaptation, since 
the expected value of fitness of local optima is so low. In order to explain a 
higher degree of adaptation, one is led to introduce mechanisms that allow the
exploration of larger regions of genotype space. One possibility is the appearance of
a "hopeful monster", i.e., a mutant whose genotype is further away from the 
dominant genotype than one or two mutations. Another (already suggested 
by S. Wright) is the appearance of a chain of slightly unfit mutants, one after 
the other, which may "reach out" for further fitness peaks. I do not find these 
suggestions very convincing. However, these possibilities cannot be discussed 
without a closer look at the genetic structure of evolving populations. 



3 The quasispecies approach 

There is no analytically treatable model, to my knowledge, that describes 
fully the structure of a population evolving on a nontrivial fitness landscape. 
In the Adaptative Walk model, all genetic variability within the population 
is neglected. The quasispecies model, introduced by M. Eigen in the context
of the theory of prebiotic evolution [13, 14], neglects fluctuations in the
composition of the population. In the case of nonoverlapping generations that
we consider here for simplicity, it may be simply derived by introducing the
effects of mutation into eq. (1). Let us denote by W(σ ← σ') the conditional
probability that, while attempting to reproduce an individual of genotype σ',
one produces instead an individual of genotype σ. Taking into account this
effect, equation (1) for the number n_{t+1}(σ) of individuals of genotype σ
at generation t + 1 becomes the quasispecies (QS) equation:

$$ n_{t+1}(\sigma) = \frac{1}{Z_t} \sum_{\sigma'} W(\sigma \leftarrow \sigma')\, A(\sigma')\, n_t(\sigma'). \tag{17} $$

The normalizing constant Z_t must be chosen so as to satisfy the external
constraint imposed on the population. The simplest constraint is constant
population size: Σ_σ n_t(σ) = M = const., which implies

$$ Z_t = \frac{1}{M} \sum_{\sigma} A(\sigma)\, n_t(\sigma) = \langle A \rangle_t, \tag{18} $$

where we have exploited the fact that Σ_σ W(σ ← σ') = 1, ∀σ'.

This equation exposes its origin in the theory of chemical reactions in
that it neglects fluctuations in the numbers n_t(σ). This neglect is warranted
when these numbers are much larger than one, which is the case when the
different chemical species are few in number, and the number of interacting
molecules is very large. However, in evolutionary theory, and even in the
RNA replication experiments discussed by W. Grüner at this meeting [?],
the number of points in genotype space is much larger than the population
size M. It is possible nevertheless to take it as a starting point, and we shall
see that it is valid at least in a particular regime.

It is easy to derive the explicit form of the mutation matrix W(σ ← σ')
when one considers only point mutations with uniform probability μ:

$$ W(\sigma \leftarrow \sigma') = \mu^{d_H(\sigma, \sigma')} (1 - \mu)^{N - d_H(\sigma, \sigma')}. \tag{19} $$

The Fundamental Theorem is recovered in the limit μ → 0. As soon as
μ > 0, however, the asymptotic composition of the population is dictated by a
balance of the effects of selection and mutation not unlike the energy-entropy
balance determining the equilibrium in thermodynamics. It is indeed possible
to formulate the solution of the QS equation in terms of equilibrium statistical
mechanics [?, 36]. The most interesting consequence of this analogy is the
existence of a phase transition between an "ordered" (selection-dominated)
regime and a "disordered" (mutation-dominated) one. This transition has
been named the "error threshold".

Let us consider the "single-peak landscape" defined by

$$ A(\sigma) = \begin{cases} A_0, & \text{if } \sigma = \sigma^*; \\ A_1 < A_0, & \text{if } \sigma \neq \sigma^*. \end{cases} \tag{20} $$

The maximum fitness is reached for the isolated sequence σ*, called the
optimal or the master sequence. It is easy to solve numerically the QS equation
by lumping together all sequences as a function of their Hamming distance
from the master sequence.
For μ = 0, we have n_t(σ) → n_∞(σ) as t → ∞, where n_∞(σ*) = M and
n_∞(σ) = 0 for σ ≠ σ*. In other words, all genotype sequences in the
population are identical, and equal to the master sequence. When μ > 0, the
stationary distribution is sharply peaked on the master sequence, but sequences
within a small Hamming distance from it (close mutants) also appear with
nonnegligible frequency. This distribution of a master sequence with its close
mutants is called a quasispecies [13, 14].



In this regime, the QS equation describes rather faithfully the structure
of the population. Most of the genotypes are equal to the master sequence
or are close to it (in terms of the Hamming distance): the frequency of these
genotypes is large enough for the corresponding fluctuations to be negligible.
To be sure, further mutants appear and disappear from the population: their
frequency is small and the relative fluctuations in n_t(σ) are large. However,
they play essentially no role in the dynamics.

As the mutation rate increases, the concentration of the master sequence
decreases. We can locate the critical value μ* of the mutation rate where the
master sequence concentration (as estimated from first-order perturbation
theory) vanishes. It satisfies

$$ (1 - \mu^*)^N = \frac{A_1}{A_0}. \tag{21} $$

For μ > μ*, the population is no longer "hooked" at the master sequence, and
the QS equation predicts an almost uniform concentration of all sequences.
Therefore μ* is a good estimate of the location of the error threshold. The
error threshold becomes sharper and sharper as N → ∞, provided that the
ratio A_0/A_1 increases exponentially with N.
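
The numerical solution mentioned above, with sequences lumped by their Hamming distance from the master sequence, can be sketched as follows. The function names, the choice N = 20, A_0 = 10, A_1 = 1, the number of iterations and the two mutation rates are all illustrative assumptions; the class-to-class mutation matrix is obtained from eq. (19) by counting how many mismatched and matching sites are flipped.

```python
from math import comb


def class_mutation_matrix(n, mu):
    """w[d][e]: probability that a parent at Hamming distance e from the master
    yields an offspring at distance d, for independent per-site flips (eq. (19))."""
    w = [[0.0] * (n + 1) for _ in range(n + 1)]
    for e in range(n + 1):
        for back in range(e + 1):            # mismatched sites flipped back
            for fwd in range(n - e + 1):     # matching sites flipped away
                flips = back + fwd
                prob = (comb(e, back) * comb(n - e, fwd)
                        * mu ** flips * (1 - mu) ** (n - flips))
                w[e - back + fwd][e] += prob
    return w


def stationary_master_fraction(n, mu, a0, a1, generations=500):
    """Iterate the class-lumped QS equation (17) on the single-peak landscape (20)."""
    w = class_mutation_matrix(n, mu)
    fit = [a0] + [a1] * n
    x = [1.0] + [0.0] * n                    # start with everyone on the master sequence
    for _ in range(generations):
        new = [sum(w[d][e] * fit[e] * x[e] for e in range(n + 1))
               for d in range(n + 1)]
        norm = sum(new)                      # plays the role of Z_t at fixed population size
        x = [v / norm for v in new]
    return x[0]


n, a0, a1 = 20, 10.0, 1.0
mu_star = 1 - (a1 / a0) ** (1 / n)           # eq. (21)
below = stationary_master_fraction(n, 0.01, a0, a1)
above = stationary_master_fraction(n, 0.2, a0, a1)
```

With these values μ* is about 0.11: for μ = 0.01 most of the population sits on the master sequence, while for μ = 0.2 the master fraction returned by the deterministic equation is very small.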

Beyond the error threshold, the predictions of the QS equation cannot be
taken at face value. In the usual case in which the population size M is much
smaller than the number of points in sequence space, 2^N, it is impossible to
reach a stationary sequence distribution with almost uniform concentration.
One has instead a wandering cloud of sequences, whose structure is dictated
by the reproduction-mutation mechanism, and where the effects of selection
can be neglected to a first approximation. This regime is well described by
the "neutral theory" due to M. Kimura [?]. It deserves a more thorough
discussion, which is deferred to the next section.

It is instructive to solve the QS equation in the rugged fitness landscape
discussed above [18]. One can identify the error threshold with a spin-glass
transition if one assumes that the "typical" values of the fitness behave like
exp(N) for large N. The role of the inverse temperature is played by β =
½ log((1 − μ)/μ). For β > β*, the population is essentially concentrated on
a fitness optimum, while for β < β* all consequences of selection disappear
in the "thermodynamic limit". It is therefore likely that the error threshold
is a general feature of all generic fitness landscapes, independently of their
ruggedness.

The concept of the error threshold is central to the theory of prebiotic
evolution [13, 14]. A suggested mechanism for the emergence of life is the
formation of complex molecules capable of self-reproduction. Given the
accuracy 1 − μ of the replication mechanism, eq. (21) sets an upper bound to
the length N of these molecules. Reasonable estimates of μ lead to values of
N which appear too short to be able to start up a Darwinian evolutionary
mechanism eventually leading to the first cell. One way out of this problem is
to assume that the necessary biological information was separated in several
different molecules, each with an N smaller than the critical one, and related
to one another in a structure of chemical reactions like the hypercycle [15]
or more general ones [?, 33]. This problem may find a completely different
solution within the theory of neutral networks expounded at this meeting by
W. Grüner [?].



4 Evolution in a flat fitness landscape 

The relative weights of mutation and selection in shaping the evolution of
natural populations have been the subject of a hot debate since the late sixties,
when Crow and Kimura introduced the Neutral Theory of Molecular
Evolution [?, 23]. This theory was prompted by the observation that natural
populations exhibit a much higher degree of genetic variability at the molecular
level than was previously suspected. If selection were dominant, this would 
imply that most of the variants which are found in natural populations have 
not yet been eliminated by Natural Selection. This, however, would mean 
that actual populations have a much lower fitness than the optimal one. 

Crow and Kimura suggested instead that most molecular variants in the 
genotype have the same fitness as the most common one. They are therefore
selectively neutral. To be sure, there are mutants corresponding to a
much smaller fitness than the dominant one, but they are quickly eliminated by
Natural Selection, in accordance with the Fundamental Theorem. But the 
variability that is left in the genotypes does not correspond to a measurable 
effect on the fitness, again in accordance with the Fundamental Theorem, 
which states that the fitness of all individuals (not their genotype) is the 
same at stationarity. Evolution by increasing adaptation, in this view, is 
a comparatively rare phenomenon, which has little bearing on the genetic 
structure of the populations at the molecular level. 

It becomes therefore rather interesting to describe the structure of a
population evolving in a flat fitness landscape, in which all genotypes have the
same fitness.

The results contained in this section are a translation of the results of the
Neutral Theory in the "spin" language which we have used so far [?]. We
consider a population of M individuals, whose genotype σ^α, α = 1, 2, …, M,
is identified by N binary variables (units) σ_i^α = ±1, i = 1, 2, …, N. Given the
genetic structure (σ^1, σ^2, …, σ^M) at generation t, the corresponding
structure at generation t + 1 is obtained according to the following procedure:

(i) For each individual α of the new population, one chooses, independently
and with uniform probability, the label α' = G_t(α) of its parent among
the M possible ones;

(ii) The genotype σ^α(t + 1) is given by (σ_1^α(t + 1), …, σ_N^α(t + 1)), where

$$ \sigma_i^{\alpha}(t + 1) = \epsilon_i^{\alpha}(t)\, \sigma_i^{G_t(\alpha)}(t). \tag{22} $$

In this equation, ε_i^α(t) = ±1 is, for each i, α, and t, an independent
random variable with average e^{−2μ}, i.e., each unit is flipped with
probability (1 − e^{−2μ})/2 ≈ μ. This equation defines the
bare mutation rate μ.

The reproduction process, by which each individual "chooses" its parent, is
a random dynamical process mapping an M-point set into itself. This process
has been thoroughly studied by Derrida and Bessis [?], and their results have
only to be translated into our language to obtain results on the statistics of
genealogies.

Let us fix our attention, for example, on the population at a given
generation much later than the beginning of the process. Let us pick at random
n individuals: it is a simple matter to show that the probability π_n that all
n individuals have n different parents is given by

$$ \pi_n = \left( 1 - \frac{1}{M} \right) \left( 1 - \frac{2}{M} \right) \cdots \left( 1 - \frac{n - 1}{M} \right) \simeq 1 - \frac{n(n - 1)}{2M}, \tag{23} $$

assuming n ≪ M. The probability p_n(t) that each of the n individuals had a
different ancestor t generations ago is obviously given by

$$ p_n(t) = \pi_n^t \simeq \exp\left( -\frac{n(n - 1)\, t}{2M} \right). \tag{24} $$
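
The quality of the approximation made in eqs. (23) and (24) is easy to check numerically. In the sketch below the function names and the values M = 1000, n = 5 are illustrative assumptions.

```python
import math


def prob_distinct_parents(n, m):
    """Exact probability that n individuals have n different parents among m."""
    p = 1.0
    for k in range(1, n):
        p *= 1.0 - k / m
    return p


def prob_distinct_ancestors(n, m, t):
    """Exponential form of eq. (24), valid for n much smaller than m."""
    return math.exp(-n * (n - 1) * t / (2.0 * m))


m, n = 1000, 5
exact_one_generation = prob_distinct_parents(n, m)
approx_one_generation = prob_distinct_ancestors(n, m, 1)
exact_fifty = exact_one_generation ** 50          # p_n(t) = pi_n ** t with t = 50
approx_fifty = prob_distinct_ancestors(n, m, 50)
```

For these values the exact product and the exponential form agree closely, both over one generation and over fifty.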

Let us now say that two individuals belong to the same t-family if they
had the same ancestor t generations ago. The number F(t) of t-families
is a random variable, which changes as the population evolves. Let us fix
again our attention on the population at a given time, and consider F(t) as
a function of t. This number is reduced, as t → t + 1, if the ancestors of two
different t-families had the same parent. This happens with a probability
equal to 1 − π_{F(t)} ≈ F(t)(F(t) − 1)/(2M). We can thus write down a "mean
field equation" for the average Φ(t) of F(t):

$$ \frac{d\Phi(t)}{dt} = -\frac{\Phi(t)\left( \Phi(t) - 1 \right)}{2M}. \tag{25} $$

The solution of this equation reads

$$ \Phi(t) = \left( 1 - e^{-t/2M} \right)^{-1}. \tag{26} $$
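
The mean-field solution can be compared with a direct simulation of the genealogy, tracing the lineages of the present population backwards in time. This is a sketch under stated assumptions: the function names, the population size M = 200 and the number of runs are arbitrary choices, and only rough agreement should be expected, since the mean-field equation neglects fluctuations and the simulated population is finite.

```python
import math
import random


def family_counts(m, generations, rng):
    """Trace the lineages of all m present individuals backwards in time.

    Returns F(0), F(1), ..., F(generations): the number of distinct ancestors
    (t-families) t generations ago, each lineage picking its parent uniformly.
    """
    lineages = set(range(m))
    counts = [len(lineages)]
    for _ in range(generations):
        lineages = {rng.randrange(m) for _ in lineages}
        counts.append(len(lineages))
    return counts


def mean_field_families(t, m):
    """Solution (26) of the mean-field equation (25)."""
    return 1.0 / (1.0 - math.exp(-t / (2.0 * m)))


rng = random.Random(3)
m = 200
runs = [family_counts(m, 20 * m, rng) for _ in range(100)]
t = m // 2
simulated = sum(run[t] for run in runs) / len(runs)
predicted = mean_field_families(t, m)
```

In every run the number of families can only decrease with t, and it reaches one well before the end of the simulated span of 20 M generations.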

According to eq. (26), after a number of generations of order M, all
individuals in the population share the same ancestor. Derrida and Bessis [?] have
calculated the probability Z_k(t) that F(t) = k. One can then obtain an exact
expression for Φ(t):

$$ \Phi(t) = \sum_{k=0}^{\infty} k\, Z_k(t) = \sum_{k=1}^{\infty} (2k - 1) \exp\left( -\frac{k(k - 1)\, t}{2M} \right). \tag{27} $$

This expression agrees with the mean-field one when t ≪ M, yielding Φ(t) ≈
2M/t, but deviates from it in the fluctuation-dominated range t > M, where
F(t) ≈ 1. It is actually possible to compute the probability distribution of the
sizes of all t-families, obtaining the result that all possible ways of breaking
the population of M individuals into k t-families have the same probability,
once k is given. As a result, the sizes of t-families fluctuate wildly.

It is indeed possible to calculate more explicitly the distribution of the
genetic structure of the population. Let us remark first of all that in the
infinite genotype limit N → ∞, the genetic overlap q^{αβ} between two
individuals α and β is a function of their relatedness. If the last common ancestor
of α and β existed T_{αβ} generations before the present, one has
q^{αβ} = exp(−4μ T_{αβ}). In fact, the genotypes of the two independent lineages
of ancestors of the two individuals have performed two independent random
walks on the hypercube, with an average rate of μN steps per generation:
along the two lineages each unit is multiplied by 2 T_{αβ} independent factors
ε, each of average e^{−2μ}.

Therefore the distribution function P(q) = ⟨δ(q^{αβ} − q)⟩ of the overlap
reflects the genealogical structure of the population. At any given time, a
peak in P(q) represents a subpopulation of ν individuals whose last common
ancestor existed τ generations ago: the location of the peak is given by
exp(−4μτ), while its height is proportional to ν². As time goes on, these
peaks move towards zero, according to the law just stated, while their height
fluctuates. From time to time some of the peaks disappear, and new ones
arise from the q ≈ 1 region, as new subpopulations appear. The genetic
structure of the population is therefore a stochastic process, which evolves
in time according to the particular realization of the mapping α → G_t(α) of
each individual to its parent. One must therefore distinguish two kinds of
averages [?]:

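• The population average, denoted by $\langle \ldots \rangle$: for example,
the average overlap $Q = \langle q \rangle$ in the population is defined by

$$Q = \langle q \rangle = \binom{M}{2}^{-1} \sum_{(\alpha,\beta)} q^{\alpha\beta}, \tag{28}$$

where the sum runs over all different pairs $(\alpha, \beta)$ of individuals
in the population.

• The process average, denoted by $\overline{\ldots}$. One has for example:

$$\overline{Q} = \overline{\langle q \rangle} = \frac{\lambda}{1+\lambda}; \tag{29}$$

$$\overline{Q^2} = \overline{\langle q \rangle^2} = \frac{\lambda^2 (9\lambda^2 + 18\lambda + 4)}{(\lambda+1)(\lambda+2)(3\lambda+1)(3\lambda+2)}, \tag{30}$$

where we have introduced the notation $\lambda = 1/(4\mu M)$.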

It is obvious from the fact that $\overline{Q^2} > \overline{Q}^2$ that $Q$ is
itself a random quantity. This observation implies that the genetic structure
of any given sample, even of a very large population, will be in general very
different from the average one, and therefore that predictions based on a
"mean field" approach, like the QS equations, could be rather misleading.
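To get a feeling for the size of these fluctuations, one can evaluate the two process averages directly. The sketch below assumes the expressions $\overline{Q} = \lambda/(1+\lambda)$ and $\overline{Q^2} = \lambda^2(9\lambda^2+18\lambda+4)/[(\lambda+1)(\lambda+2)(3\lambda+1)(3\lambda+2)]$ with $\lambda = 1/(4\mu M)$; the function names are illustrative.

```python
def mean_overlap(lam):
    """Process average of Q, assuming the form lam / (1 + lam)."""
    return lam / (1.0 + lam)

def mean_square_overlap(lam):
    """Process average of Q^2, assuming the rational form quoted in the lead-in."""
    num = lam ** 2 * (9 * lam ** 2 + 18 * lam + 4)
    den = (lam + 1) * (lam + 2) * (3 * lam + 1) * (3 * lam + 2)
    return num / den

def overlap_variance(lam):
    """Variance of Q over realizations of the reproduction process."""
    return mean_square_overlap(lam) - mean_overlap(lam) ** 2

if __name__ == "__main__":
    for lam in (0.1, 1.0, 10.0):
        print(lam, mean_overlap(lam), overlap_variance(lam))
```

The variance depends on $M$ and $\mu$ only through $\lambda$, so it does not vanish when the population becomes large at fixed $\lambda$.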

The average overlap $\overline{Q}$ depends on population size $M$, and
decreases as $M$ increases. Thus genetic variability increases with increasing
population size. In most natural populations genetic variability is much
smaller than expected on the basis of this result [23]. In fact, natural
populations are often the outcome of a comparatively recent "population boom"
involving a rather small founder population. In order to reach the
"equilibrium" value quoted above one should wait a number of generations of
order $M$, where $M$ is the population size. This is often too long, and the
result is that the actual variability reflects the much smaller size of the
founder population.

It is also interesting to monitor the evolution of the average genotype

$$\langle \sigma \rangle = (\langle \sigma_1 \rangle, \ldots, \langle \sigma_N \rangle). \tag{31}$$

The genetic drift of the population (a physicist would call it diffusion) is
represented by the correlation function

$$K(t) = \frac{1}{N} \sum_{i=1}^{N} \overline{\langle \sigma_i \rangle_{t_0} \langle \sigma_i \rangle_{t_0+t}}, \tag{32}$$

where $\langle \ldots \rangle_t$ denotes the population average at generation
$t$. The exponential decrease of the correlation
$K(t) \propto \exp(-2\mu^*|t|)$ defines the effective mutation rate $\mu^*$.
A simple calculation yields

$$K(t) = \overline{Q} \exp(-2\mu|t|). \tag{33}$$

In our case therefore, the effective mutation rate is equal to the bare one,
and in particular, it is independent of population size. This rather
surprising result is known as the Kimura theorem.

The previous result holds almost without change if all fit genotypes have
the same fitness value, while unfit ones have a negligible one. Let us assume
that a fraction $x$ of the genotypes is unfit, and therefore practically
unable to reproduce, and that fit and unfit genotypes are distributed at
random on the hypercube. Therefore $Nx$ neighbors of every fit genotype will
be unfit on average. If a mutation appears, it will lead to an unfit genotype
with probability $x$. It will be safer not to mutate, since one's parent is
fit by definition. Therefore the effective mutation rate $\mu^*$ will be
smaller than the bare one $\mu$, and given approximately by
$\mu^* \simeq \mu(1 - x)$.

The effective mutation rate $\mu^*$ can only be nonzero if the clusters of fit
genotypes span the hypercube, i.e., if it is possible to connect two
arbitrarily different fit genotypes via a chain of fit mutants. Let us say
that the fit genotype $\sigma$ belongs to the same cluster as the fit genotype
$\sigma'$ if it differs in only one unit $\sigma_i$ either from $\sigma'$ or
from a fit genotype which belongs to the same cluster as $\sigma'$. If $x$ is
small, there is a large cluster of fit genotypes which spans the hypercube:
starting from any point on it, one can reach genotypes whose overlap with the
initial point is arbitrarily small. However, when $x > x^* \simeq 1 - 1/N$,
the space of fit genotypes breaks down into small clusters, and it is never
possible to wander far away from the initial point stepping only on fit
genotypes via single-point mutations. In this case memory of the initial
genotype is maintained forever. This phenomenon is called the percolation
threshold, and we see that it is closely related to the error threshold
discussed in the previous section. We can introduce the order parameter

$$q_\infty = \lim_{t\to\infty} \frac{1}{N} \sum_{i=1}^{N} \left( \langle \sigma_i(t) \rangle \right)^2, \tag{34}$$


where it is understood to take the process average with a fixed initial
genotype $\sigma(0)$. This order parameter is nonzero in the "trapped" regime
$x > x^*$.
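The breakdown of the fit genotypes into clusters can be explored on a small hypercube. The following is a minimal sketch, assuming each genotype is independently unfit with probability `x` and encoding a genotype of `n_sites` binary units as an integer; the function name `fit_cluster_size` and the choice of genotype 0 as the (fit) starting point are illustrative assumptions.

```python
import random
from collections import deque

def fit_cluster_size(n_sites, x, seed=0):
    """Number of fit genotypes reachable from genotype 0 via single-unit mutations.

    Assumption: each of the 2**n_sites genotypes is unfit with probability x,
    independently; genotype 0 is declared fit so that the walk can start there.
    """
    rng = random.Random(seed)
    n_genotypes = 1 << n_sites
    fit = [rng.random() >= x for _ in range(n_genotypes)]
    fit[0] = True
    seen = {0}
    queue = deque([0])
    while queue:  # breadth-first search restricted to fit genotypes
        g = queue.popleft()
        for i in range(n_sites):
            neighbor = g ^ (1 << i)  # flip unit i
            if fit[neighbor] and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen)

if __name__ == "__main__":
    for x in (0.0, 0.5, 0.8, 0.95):
        print(x, fit_cluster_size(12, x))
```

For small `x` the cluster contains nearly all fit genotypes, whereas close to `x = 1` the starting genotype is typically isolated.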

In the "wandering" regime $x < x^*$, the population evolves in a neutral
network, and is able to explore larger and larger regions of sequence space
as time goes on [20], [30]. At any given point, the number of fit mutants can
be small, but the sequence space has a large connectivity, and the neutral
network can efficiently span it.

In a number of proteins one can relate the number of amino acid substitutions
in different taxa to their respective time of divergence, i.e., the time
of the existence of their last common ancestor. One obtains a well-defined
substitution rate which is specific to the protein, except for very
conservative proteins like histones. This suggests that proteins also evolve
along neutral networks: unfit amino acid substitutions are eliminated at each
step, but fit substitutions, which are selectively neutral, are retained in
accordance
with the neutral theory f2c 



5 Two-parent reproduction and the origin of 
species 

Since most of the organisms which are close to us in daily life reproduce
sexually, it is interesting to ask if the results of the previous section hold
for a two-parent reproduction mechanism. The model can be defined in analogy
with the one-parent model of the previous section. One chooses, at each
generation and for each individual $\alpha$, two parents $\alpha'$ and
$\alpha''$. The genotype $\sigma^\alpha(t+1)$ of the individual $\alpha$ in
the new generation is then given (for $i = 1, 2, \ldots, N$) by

$$\sigma_i^\alpha(t+1) = \epsilon_i^\alpha(t) \left[ \xi_i^\alpha(t)\, \sigma_i^{\alpha''}(t) + \left(1 - \xi_i^\alpha(t)\right) \sigma_i^{\alpha'}(t) \right]. \tag{35}$$

Here $\epsilon_i^\alpha(t) = \pm 1$ represents the effects of mutations as in
the previous section, while $\xi_i^\alpha(t) \in \{0, 1\}$ determines from
which parent our individual inherits the state of its unit $\sigma_i^\alpha$:
if $\xi_i^\alpha = 0$, from $\alpha'$; otherwise, from $\alpha''$. One has
$\overline{\xi_i^\alpha(t)} = \frac{1}{2}$.
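The reproduction rule just described can be sketched in a few lines. This is an illustration rather than the original implementation: genotypes are lists of units equal to +1 or -1, each unit is inherited from either parent with probability 1/2, and a mutation (a sign flip) is assumed to occur with probability `mu` per unit. The function name `offspring_genotype` is hypothetical.

```python
import random

def offspring_genotype(parent1, parent2, mu, rng):
    """Build one offspring genotype from two parents, unit by unit.

    Assumptions: xi = 0 selects the first parent and xi = 1 the second, each
    with probability 1/2; eps = -1 (a mutation) occurs with probability mu.
    """
    child = []
    for s1, s2 in zip(parent1, parent2):
        xi = rng.randint(0, 1)
        inherited = s2 if xi else s1
        eps = -1 if rng.random() < mu else 1
        child.append(eps * inherited)
    return child

if __name__ == "__main__":
    rng = random.Random(42)
    mother = [1, 1, 1, 1, 1, 1]
    father = [-1, -1, -1, 1, 1, 1]
    print(offspring_genotype(mother, father, 0.01, rng))
```

Units on which the two parents agree are transmitted unchanged unless a mutation occurs, so recombination only reshuffles the units on which the parents differ.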
The genetic variability is again expressed in terms of the average genetic
overlap

$$A = \overline{Q} = \overline{\langle q \rangle} = \overline{q^{\alpha\beta}}. \tag{36}$$
One obtains for $\overline{Q}$ the equation

$$\overline{Q} = e^{-4\mu} \left[ \frac{1}{M} + \left(1 - \frac{1}{M}\right) \overline{Q} \right]. \tag{37}$$
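As a worked step, and reading eq. (37) as $\overline{Q} = e^{-4\mu}\left[1/M + (1 - 1/M)\,\overline{Q}\right]$, the steady-state value follows by solving for $\overline{Q}$ and then taking $\mu \to 0$, $M \to \infty$ at fixed $4\mu M = \lambda^{-1}$ (the limit assumed throughout this section):

```latex
\begin{align*}
\overline{Q}\left[1 - e^{-4\mu}\left(1 - \frac{1}{M}\right)\right] &= \frac{e^{-4\mu}}{M}, \\
\overline{Q} &= \frac{e^{-4\mu}}{M\left(1 - e^{-4\mu}\right) + e^{-4\mu}}
  \;\longrightarrow\; \frac{1}{4\mu M + 1} = \frac{\lambda}{1 + \lambda}.
\end{align*}
```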



This equation implies $\overline{Q} = \lambda/(1 + \lambda)$, as for the
one-parent reproduction mechanism. Fluctuations are determined by the
quantities



B 
C 
D 



(Q 2 ) 



a a P 



Q 2 = (qY 



w t w t w j w ] 



(38) 
(39) 
(40) 



It is understood that different indices take on different values. We assume
$M \to \infty$, but $4\mu M \to \lambda^{-1} = \text{const}$. In this limit, we
have



3B 



A 



+ 3j C 
AD 



C + 2D; 
A + 2D; 
2D + 2C + A. 



(41) 
(42) 
(43) 



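These equations imply

$$B = C = D = \frac{\lambda^2}{(1+\lambda)^2}, \tag{44}$$

and therefore

$$\overline{Q^2} = \overline{\langle q^2 \rangle} = \overline{\langle q \rangle}^2. \tag{45}$$
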
Stated in other words, this result implies that in a population in which
all possible pairs have the same probability of producing offspring (such
populations are called panmictic), the genetic difference between any two
individuals has the same value with probability one. To be sure, this result
has to be modified in actual populations: siblings are genetically more
similar than strangers; however, it is in agreement with the fact that
specific genetic correlations are lost very rapidly as the genealogical
relatedness decreases.

Higgs and Derrida [19] have taken advantage of this result to introduce a
minimal model for the formation of biological species. They assume that a
pair of individuals can produce offspring only if their genotypes are not too
different: more explicitly, if their genetic overlap $q$ is larger than a
threshold $q_0$ (fecundity threshold). As long as $M$ and $\mu$ are such that
$\overline{Q}$ (as obtained above) is larger than $q_0$, nothing happens; but
if the population size $M$ or the mutation rate $\mu$ is too large, the
population splits into several subpopulations: the genetic overlap $q$ of
individuals belonging to the same subpopulation is larger than $q_0$, whereas
the overlap between individuals belonging to different subpopulations is
smaller. Therefore, offspring can only be produced by pairs of individuals
belonging to the same subpopulation. This is exactly the definition of the
biological species, according to Mayr:

Species are groups of actually or potentially interbreeding natural 
populations which are reproductively isolated from other such 
groups. 

The actual behavior of the population is extremely irregular: the size of the
subpopulations fluctuates according to the irregularities in their sampling,
much in the same way as family size in the one-parent reproduction mechanism;
and the value of the corresponding characteristic overlap fluctuates in
agreement with the expression obtained above which relates the genetic
overlap of a panmictic population to its size. From time to time, a
subpopulation becomes too large, and the corresponding overlap hits the
threshold. After a short period of "confusion" two new subpopulations
(species) arise. The mutual overlap between different species evolves in time
according to the exponential law that we have derived in the previous
section. The process average of the overlap $Q$ over the whole population is,
rather remarkably, given by the same expression valid for a panmictic
population [19]. No explanation has yet been found for this result, which
appears very clearly in the simulated populations.
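One generation of a model with a fecundity threshold can be sketched as follows. This is a simplified illustration and not necessarily the sampling rule of the original model: here every newborn draws its pair of parents uniformly among all pairs whose overlap exceeds `q0`, units are inherited from either parent with probability 1/2, and each unit flips with probability `mu`. The function names `overlap` and `next_generation` are hypothetical.

```python
import random

def overlap(g1, g2):
    """Genetic overlap: average over units of the product of the two states."""
    return sum(a * b for a, b in zip(g1, g2)) / len(g1)

def next_generation(population, mu, q0, rng):
    """Produce a new population of the same size from mutually fecund pairs.

    Assumption: parents are drawn uniformly among pairs with overlap > q0.
    """
    m = len(population)
    fertile_pairs = [(i, j) for i in range(m) for j in range(i + 1, m)
                     if overlap(population[i], population[j]) > q0]
    if not fertile_pairs:
        raise ValueError("no pair of individuals exceeds the fecundity threshold")
    new_population = []
    for _ in range(m):
        i, j = rng.choice(fertile_pairs)
        child = []
        for s1, s2 in zip(population[i], population[j]):
            inherited = s1 if rng.random() < 0.5 else s2
            flip = rng.random() < mu
            child.append(-inherited if flip else inherited)
        new_population.append(child)
    return new_population

if __name__ == "__main__":
    rng = random.Random(0)
    pop = [[1] * 20 for _ in range(10)]
    for _ in range(5):
        pop = next_generation(pop, 0.01, 0.5, rng)
    print(overlap(pop[0], pop[1]))
```

With two groups whose mutual overlap lies below `q0`, no hybrid offspring can appear, which is the sense in which the groups behave as separate species.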
The same model can be generalized to treat the effects of geographic
isolation. One considers a population which reproduces with the same
two-parent mechanism discussed above, but which is distributed in several
"islands". Reproducing pairs can only be formed among individuals inhabiting
the same island, but before each "mating season" an individual can move from
one island to a neighboring one with a small probability $\epsilon$. As a
consequence, the overlap between individuals belonging to the same island
is larger than between different islands. When the migration rate becomes
so small that the average overlap between neighboring islands drops below
the fecundity threshold, the system starts behaving in the same irregular way
as discussed above. Again, the average overlaps (either between islands or
within one island) behave as in the absence of the fecundity threshold.

There is an interesting phenomenon which takes place in rings of islands.
One may reach a regime in which the average overlap between neighboring
islands is above threshold, while that between islands further away is below.
In this case it is possible to start from one island and to move in one
direction in the ring, always finding mutually fecund populations in
neighboring islands. However, coming back to the starting point, one finds a
reproductive barrier, and possibly the coexistence, in the same island, of two
mutually sterile populations. This phenomenon appears rather frequently in
the simulations [25] and can be related to the circular invasion phenomenon
observed in natural populations [27]. For example, the herring gull Larus
argentatus exhibits a group of populations which are mutually interfecund if
one starts from Scandinavia and moves towards the East, reaching North
America via Siberia. However, where the last American population overlaps
with the first European one, they are no longer mutually interfecund.
Therefore the relation "being mutually interfecund" is not necessarily an
equivalence relation.
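The failure of transitivity can be made concrete with a toy chain of populations. The sketch below assumes, purely for illustration, that the overlap between two populations decays geometrically with the number of steps separating them along the chain; the names `chain_overlap`, `interfecundity` and `is_transitive` are hypothetical.

```python
def chain_overlap(i, j, q_neighbor):
    """Toy assumption: overlap decays geometrically with distance along the chain."""
    return q_neighbor ** abs(i - j)

def interfecundity(n_populations, q_neighbor, q0):
    """Matrix rel[i][j]: True if populations i and j are mutually interfecund."""
    return [[chain_overlap(i, j, q_neighbor) > q0 for j in range(n_populations)]
            for i in range(n_populations)]

def is_transitive(rel):
    """Check whether rel[i][j] and rel[j][k] always imply rel[i][k]."""
    n = len(rel)
    return all(rel[i][k]
               for i in range(n) for j in range(n) for k in range(n)
               if rel[i][j] and rel[j][k])

if __name__ == "__main__":
    rel = interfecundity(6, 0.8, 0.5)
    print(all(rel[i][i + 1] for i in range(5)))  # neighbors can interbreed
    print(rel[0][5])                             # the two ends of the chain
    print(is_transitive(rel))
```

The relation is reflexive and symmetric by construction, so the lack of transitivity is the only obstacle to its being an equivalence relation.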

If one goes over to consider all-or-none selection in the presence of a
two-parent reproduction mechanism [29], one observes another peculiar
phenomenon. The effective mutation rate $\mu^*$ does not appear to depend on
the fraction $x$ of unfit genotypes, as long as $x$ is small. When $x$ crosses
a threshold $x^*$ (which increases with the genome size $N$) the mutation rate
drops suddenly and the average fitness of the population increases. This
behavior can be understood in terms of collective adaptation, i.e., of the
search for a region which optimizes the recombinational fitness of the
individual, a quantity which measures also the likelihood that the offspring
of an individual, obtained by mating with the other members of the
population, are fit. A quantitative theory of this transition (which is
rather striking in the simulations) is still lacking.



6 Two-level selection and the maintenance of 
unselfish genes 

The last observation prompts us to consider situations in which the selection 
process leads to potential conflicts. One case, much studied in the literature, 
is the possible existence of unselfish genes, which determine, in the individual 
which expresses them, a behavior disadvantageous to the carrier, but benefi- 
cial to the group to which it belongs. While the existence of such genes has 
not yet been proved beyond doubt, it is an interesting problem in itself to see 
whether the interaction between the two selection levels, the individual and 
the group, can lead to the permanence of these genes in the population. Here 
I shall only report briefly a model which has been introduced and analyzed 



by R. Donato, M. Serva and myself |Ll], [Tl[ . 

We consider a population divided into groups (demes) of L individuals. 
Mating is only allowed within a group. Heredity acts according to the usual 
Mendelian mechanism. There is a behavioral locus with two alleles: a selfish 
(S) allele (dominant), and an unselfish (U) one (recessive). (It is easy to 
modify the model to consider more alleles, different reproduction schemes, 
etc.) The fitness of an individual depends on two factors: (i) if it is an
unselfish homozygote (UU) it is reduced, by a factor $(1 - r)$, with respect
to the other members of a group; (ii) if it belongs to a group with a large
enough fraction $x$ of UU-individuals (larger than a threshold $x^*$), it is
enhanced, by a factor $(1 + c)$, with respect to groups which do not satisfy
this condition.

This definition can be summarized in the following table:

Table 1: Fitness table

                     x < x*       x > x*
Genotype: UU         1 - r        (1 - r)(1 + c)
Other genotypes      1            1 + c
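The fitness table translates directly into a small function. This is a minimal sketch; the function name `individual_fitness` and the string encoding of genotypes are illustrative, and the behavior exactly at `x == x_star`, which the table leaves unspecified, is assumed here to fall in the low-fitness column.

```python
def individual_fitness(genotype, x, r, c, x_star):
    """Fitness of an individual according to Table 1.

    genotype: "UU" for the unselfish homozygote, anything else otherwise.
    x: fraction of UU individuals in the individual's deme.
    Assumption: the group benefit applies only for x strictly above x_star.
    """
    altruism_cost = (1.0 - r) if genotype == "UU" else 1.0
    group_benefit = (1.0 + c) if x > x_star else 1.0
    return altruism_cost * group_benefit

if __name__ == "__main__":
    for genotype in ("UU", "SU", "SS"):
        for x in (0.2, 0.8):
            print(genotype, x, individual_fitness(genotype, x, 0.1, 0.5, 0.5))
```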
As a consequence of this selection scheme, the fraction $x$ of unselfish
individuals decreases within any given deme. On the other hand, demes
with $x > x^*$ have higher average fitness and tend to expand at the expense
of the others. When a deme grows too large, it splits into two demes, and its
members are redistributed at random between the two new demes. We can
represent this process via a "deme fitness" $A(x)$, proportional to the total
fitness of individuals which form each deme, and given by

$$A(x) = \begin{cases} 1 - rx, & x < x^*; \\ (1 + c)(1 - rx), & x > x^*. \end{cases} \tag{46}$$

In the limit of very large population (infinite number of demes) the process
can be described by a quasispecies equation at the level of demes. Denoting
by $p_t(x)$ the fraction of demes with a fraction $x$ of unselfish
individuals, we have

$$p_{t+1}(x) = \frac{\int dx'\, P(x \leftarrow x')\, A(x')\, p_t(x')}{\int dx'\, A(x')\, p_t(x')}. \tag{47}$$

In this equation, $P(x \leftarrow x')$ is the conditional probability density
to produce a deme with a fraction $x$ of unselfish individuals, starting from
one with a fraction $x'$. This probability contains two effects: (i) the
systematic decrease of $x$, due to the disadvantage of altruism; (ii) the
fluctuations due to random sampling, a consequence of the finite deme size
$L$.
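A deme-level iteration of this kind can be sketched numerically. The code below is an illustration under loudly simplifying assumptions, not the original model: the deme fitness is taken to be $1 - rx$, multiplied by $1 + c$ when $x > x^*$ (the per-individual average of Table 1); $x$ is treated directly as the frequency of unselfish individuals, ignoring the Mendelian details; and the transition probability combines a deterministic within-deme decrease of $x$ with binomial sampling of $L$ individuals. All function names are hypothetical.

```python
import math

def deme_fitness(x, r, c, x_star):
    """Average fitness of a deme with a fraction x of unselfish individuals."""
    base = 1.0 - r * x
    return base * (1.0 + c) if x > x_star else base

def transition_matrix(L, r):
    """P[k][j]: probability that a deme with j unselfish members yields one with k.

    Assumption: selection first lowers x to x(1-r)/(1-rx) (requires r < 1),
    then L individuals are drawn binomially with that frequency.
    """
    P = [[0.0] * (L + 1) for _ in range(L + 1)]
    for j in range(L + 1):
        x = j / L
        x_sel = x * (1.0 - r) / (1.0 - r * x)
        for k in range(L + 1):
            P[k][j] = math.comb(L, k) * x_sel ** k * (1.0 - x_sel) ** (L - k)
    return P

def quasispecies_step(p, P, L, r, c, x_star):
    """One step: weight demes by their fitness, apply transitions, normalize."""
    weights = [deme_fitness(j / L, r, c, x_star) * p[j] for j in range(L + 1)]
    norm = sum(weights)
    return [sum(P[k][j] * weights[j] for j in range(L + 1)) / norm
            for k in range(L + 1)]

if __name__ == "__main__":
    L, r, c, x_star = 10, 0.1, 0.5, 0.5
    P = transition_matrix(L, r)
    p = [1.0 / (L + 1)] * (L + 1)
    for _ in range(200):
        p = quasispecies_step(p, P, L, r, c, x_star)
    print(sum(j / L * p[j] for j in range(L + 1)))
```

In this simplified sketch the states $x = 0$ and $x = 1$ are absorbing, since no mutation is included, so long runs mainly illustrate the competition between within-deme decay and between-deme selection.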

The quasispecies equation for the demes can be solved numerically for
the steady state distribution. One finds a line $(1 - r)(1 + c) = f(L)$ which
separates two regimes: when $r$ is too large (or $c$ is too small), the
distribution is peaked at $x = 0$, and therefore unselfish genes are wiped out
of the population; otherwise, the distribution is nontrivial, and the average
value of $x$ is different from zero. The transition appears to be
discontinuous. It is interesting to remark that it is the competition between
demes that keeps unselfish genes in the population: there is no "optimal
value" for $x$. Again, the "steady state" hides a complicated dynamical
behavior: demes with high values of $x$ increase in number, but their values
of $x$ decrease; however, new ones with high values of $x$ arise from the
splitting of old ones, and so on.



7 Conclusions

Fitness as an individual property, in the way we have used here, is a powerful
tool for model building. However, it is not measurable in the field, because
the actual number of viable offspring of an individual is the outcome of its
complex interaction with other members of the same population and with
its environment. Some aspects of these interactions, from the point of view
of evolutionary success, can be captured by the game theory approach [32].
The most important aspect is the difference between local optimization, i.e.,
fitness optimization at the level of the individual, and global optimization,
at the level of the community.

A community can be formed by members of the same species, or of different
species. One of the key problems in understanding evolutionary innovation
is the evolution of individuality, i.e., of the organization of different
units into a single integrated organism to which it is possible to assign a
fitness. This is also the problem of the emergence of mutualism and can be
the key point in the understanding of the evolution of multicellular
organisms.

Nevertheless the concept of fitness, with its strong aspects of "physicalism"
related to its similarity with energy, is a very convenient stepping stone
to enter, as physicists, the arena of evolutionary theory.